An Optimized Web Feed Aggregation Approach for Generic Feed Types

Authors

  • David Urbansky
  • Sandro Reichert
  • Klemens Muthmann
  • Daniel Schuster
  • Alexander Schill
Abstract

Web feeds are a popular way to access updates to content on the World Wide Web. Unfortunately, the technology behind web feeds is based on polling: clients must regularly ask the feed server for updates. There are two problems with this approach. First, many of a client's requests find no new items. Second, if the client's update interval is too long, it may be notified too late or even miss items. In this work we present adaptive feed polling algorithms. The algorithms learn from the previous behavior of feeds and predict their future behavior. To evaluate these algorithms, we created a real-world set of over 180,000 diverse feeds and collected a dataset of their updates over a period of three weeks. We tested our adaptive algorithms on this set and show that adaptive feed polling reduces traffic significantly and provides near-real-time updates.
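The abstract does not say how the adaptive algorithms work, so the following is only a minimal sketch of the general idea of learning a polling interval from a feed's past behavior. It is not the authors' method. It assumes one simple heuristic (the mean gap between the publication times of previously seen items), and the function name `next_poll_interval` and its parameters `min_interval`, `max_interval`, and `default_interval` are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import List


def next_poll_interval(
    item_times: List[datetime],
    min_interval: timedelta = timedelta(minutes=1),
    max_interval: timedelta = timedelta(days=1),
    default_interval: timedelta = timedelta(hours=1),
) -> timedelta:
    """Suggest how long to wait before polling a feed again.

    Assumed heuristic (not taken from the paper): use the mean gap
    between the publication times of previously seen items, clamped
    to the range [min_interval, max_interval].
    """
    # With fewer than two items there is no gap to learn from.
    if len(item_times) < 2:
        return default_interval

    ordered = sorted(item_times)
    # Gaps between consecutive items, in seconds.
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
    mean_gap = timedelta(seconds=sum(gaps) / len(gaps))

    # Clamp so that very busy or very quiet feeds stay within bounds.
    return min(max(mean_gap, min_interval), max_interval)


if __name__ == "__main__":
    start = datetime(2011, 1, 1, 12, 0)
    history = [start + timedelta(minutes=10 * i) for i in range(6)]
    print(next_poll_interval(history))
```

A fixed interval would have to trade wasted requests against late notifications for every feed at once. A per-feed estimate like this one polls busy feeds more often and quiet feeds less often, which is the trade-off the abstract describes.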


Similar resources

Optimizing large collections of continuous content-based RSS aggregation queries

In this article we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS is based on a data-centric approach, using a combination of standard database concepts like declarative query languages, views and multi-query optimization. Users create personalized feeds by defining and composing content-based filterin...


RoSeS: A Continuous Content-Based Query Engine for RSS Feeds

In this article we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS is based on a data-centric approach, using a combination of standard database concepts like declarative query languages, views and multiquery optimization. Users create personalized feeds by defining and composing content-based filtering...


The Feed Analyzer: Implementation and Evaluation

Modern society is producing more information at a faster pace than ever before; data which is increasingly used to solve a variety of real world problems. But given the vast size of the data involved and the resource intensive nature of rapid large-data processing, the need for more advanced methodologies in this regard is growing. This phenomenon has given rise to the term ‘Big Data’ which ref...


Best-Effort Refresh Strategies for Content-Based RSS Feed Aggregation

During the past several years RSS-based content syndication has become a standard technique for the efficient and timely dissemination of information on the web. From a data processing perspective, RSS feeds are standard XML resources that are periodically refreshed by feed aggregators to generate continuous streams of items. In this article, we study the problem of information loss in the contex...


Full Data Controlled Web-Based Feed Aggregator

Feed syndication is analogous to electronic newsletters: both aim to deliver feeds to subscribers. The difference is that a newsletter subscription requires an e-mail address and exposes you to spam and security challenges, whereas feed syndication ensures that you get only what you requested. This paper reports a review on the state of the art of feed aggregation technology and the development o...




Publication date: 2011